Learning Semantic Lexicons using Graph Mutual Reinforcement based Bootstrapping

Authors

  • Qi Zhang
  • Xipeng Qiu
  • Xuanjing Huang
  • Lide Wu
Abstract

Bootstrapping has received considerable attention in many fields and has achieved good results, and semantic lexicons have proved useful for many natural language processing tasks. This paper presents an approach to learning semantic lexicons using a new bootstrapping method based on Graph Mutual Reinforcement. The approach uses only unlabeled data and a few seed words to learn new words for each semantic category. In contrast to other bootstrapping methods, it uses Graph Mutual Reinforcement based Bootstrapping (GMR-Bootstrapping) to rank the candidate words and patterns. Experimental results show that GMR-Bootstrapping outperforms state-of-the-art algorithms on both in-domain and out-of-domain data. The results also show that performance depends not only on the size of the corpus but also on its quality.
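
The abstract does not give the scoring formulas, so the following is only a rough sketch of the general idea of mutual reinforcement between words and extraction patterns. It assumes a HITS-style update on a bipartite word-pattern graph, which is a stand-in chosen for illustration and not the authors' GMR formulation. The function names, the toy edges, and the choice of two iterations are all hypothetical.

```python
import math


def mutual_reinforcement(edges, seeds, iterations=2):
    """Score words and patterns on a bipartite co-occurrence graph.

    edges: (word, pattern) pairs, meaning the pattern extracted the word.
    seeds: words known to belong to the target category.
    NOTE: HITS-style update assumed for illustration; not the paper's formula.
    """
    edges = list(edges)
    words = {w for w, _ in edges}
    patterns = {p for _, p in edges}
    # Seed words start at 1.0; every other word starts at 0.0.
    word_score = {w: (1.0 if w in seeds else 0.0) for w in words}
    pattern_score = {p: 0.0 for p in patterns}

    for _ in range(iterations):
        # A pattern is good if it extracts highly scored words.
        raw = {p: 0.0 for p in patterns}
        for w, p in edges:
            raw[p] += word_score[w]
        norm = math.sqrt(sum(v * v for v in raw.values())) or 1.0
        pattern_score = {p: v / norm for p, v in raw.items()}

        # A word is good if it is extracted by highly scored patterns.
        raw = {w: 0.0 for w in words}
        for w, p in edges:
            raw[w] += pattern_score[p]
        norm = math.sqrt(sum(v * v for v in raw.values())) or 1.0
        word_score = {w: v / norm for w, v in raw.items()}

    return word_score, pattern_score


def rank_candidates(word_score, seeds):
    """Return non-seed words, best first (ties broken alphabetically)."""
    candidates = [w for w in word_score if w not in seeds]
    return sorted(candidates, key=lambda w: (-word_score[w], w))


# Toy data (hypothetical): two seed city names and three unlabeled words.
toy_edges = [
    ("paris", "city of X"), ("london", "city of X"), ("berlin", "city of X"),
    ("paris", "X is the capital"), ("berlin", "X is the capital"),
    ("apple", "ate an X"), ("banana", "ate an X"),
]
toy_seeds = {"paris", "london"}
toy_word_score, toy_pattern_score = mutual_reinforcement(toy_edges, toy_seeds)
toy_ranking = rank_candidates(toy_word_score, toy_seeds)
```

In this toy graph, "berlin" shares patterns with the seeds and so receives a positive score, while "apple" and "banana" occur only with a pattern that no seed touches and stay at zero. The iteration count is kept small because an unmodified HITS iteration run to convergence would depend less and less on the seed set.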

Similar resources

A Bootstrapping Method for Learning Semantic Lexicons using Extraction Pattern Contexts

This paper describes a bootstrapping algorithm called Basilisk that learns high-quality semantic lexicons for multiple categories. Basilisk begins with an unannotated corpus and seed words for each semantic category, which are then bootstrapped to learn new words for each category. Basilisk hypothesizes the semantic class of a word based on collective information over a large body of extraction ...

Weighted Mutual Exclusion Bootstrapping for Domain Independent Lexicon and Template Acquisition

We present the Weighted Mutual Exclusion Bootstrapping (WMEB) algorithm for simultaneously extracting precise semantic lexicons and templates for multiple categories. WMEB is capable of extracting larger lexicons with higher precision than previous techniques, successfully reducing semantic drift by incorporating new weighting functions and a cumulative template pool while still enforcing mutua...

Relation Guided Bootstrapping of Semantic Lexicons

State-of-the-art bootstrapping systems rely on expert-crafted semantic constraints such as negative categories to reduce semantic drift. Unfortunately, their use introduces a substantial amount of supervised knowledge. We present the Relation Guided Bootstrapping (RGB) algorithm, which simultaneously extracts lexicons and open relationships to guide lexicon growth and reduce semantic drift. Thi...

Semantic Bootstrapping with a Cluster-Based Extension to DIPRE

The practical applications of information extraction are currently limited by the need to hand-construct search patterns and lexicons and / or to have available large labelled training sets. To address this issue, we present a semantic bootstrapping technique based on Brin’s DIPRE algorithm. The basic algorithm is extended by using clustering to group similar occurrences when extracting new pat...

Morpho-syntactic Lexicon Generation Using Graph-based Semi-supervised Learning

Morpho-syntactic lexicons provide information about the morphological and syntactic roles of words in a language. Such lexicons are not available for all languages and even when available, their coverage can be limited. We present a graph-based semi-supervised learning method that uses the morphological, syntactic and semantic relations between words to automatically construct wide coverage lex...

Publication date: 2008